A Novel Approach of Segmenting Touching and Kerned Characters

Authors

  • Yangxing LIU
  • Yupin LUO
  • Fei LIU
  • Zhongqi QIU
Abstract

Character segmentation is a critical step in an OCR system. In this paper we discuss approaches to segmenting touching and kerned characters and put forward an algorithm based on nonlinear segmentation paths. First, touching and kerned characters are extracted and separated from the other characters using character projections and recognition results. Then, to find the nonlinear segmentation path of the touching and kerned characters, a heuristic method that seeks a minimal-penalty curved cut is used. Finally, the regions belonging to the same character are merged. Experimental results show that the proposed method achieves a high segmentation rate on documents written in both English and Japanese characters.


Similar resources

A fuzzy approach for segmentation of touching characters

Correctly segmenting touching characters is a hard problem. In recent years, many methods and algorithms have been proposed without achieving a comprehensive solution. In this paper, a novel method based on fuzzy logic is studied. The proposed method combines three features of touching characters that other studies have exploited one at a time. The strategy is ...


Segmentation of touching characters in printed document recognition

A new discrimination function is presented for segmenting touching characters based on both pixel and profile projections. A dynamic recursive segmentation algorithm is developed for effectively segmenting touching characters. Contextual information and spell checking are used to correct errors caused by incorrect recognition and segmentation. Based on 12 real documents, a maximum 99....


Segmentation Problems and Solutions in Printed Degraded Gurmukhi Script

Character segmentation is an important preprocessing step for text recognition. In degraded documents, touching characters drastically decrease the recognition rate of any optical character recognition (OCR) system. In this paper we propose a complete solution for segmenting touching characters in all three zones of printed Gurmukhi script. A study of touching Gurmukhi cha...


A Study of Touching Characters in Degraded Gurmukhi Text

Character segmentation is an important preprocessing step for text recognition. In degraded documents, touching characters drastically decrease the recognition rate of any optical character recognition (OCR) system. In this paper a study of touching Gurmukhi characters is carried out, and these characters are divided into various categories after careful analysis. Structural ...


Multi-oriented touching text character segmentation in graphical documents using dynamic programming

The touching character segmentation problem becomes complex when touching strings are multi-oriented. Moreover, in graphical documents, characters in a single touching string sometimes have different orientations. Segmenting such complex touching strings is more challenging. In this paper, we present a scheme towards the segmentation of English multi-oriented touching strings into individual characte...



Publication date: 2001